Analysis and summarization of correlations in data cubes

Authors

  • Chien-Yu Chen
  • Shien-Ching Hwang
  • Yen-Jen Oyang
Abstract

This paper presents a novel mechanism for analyzing and summarizing the statistical correlations among the attributes of a data cube. To support this analysis and summarization, the paper proposes a new measure of statistical significance. The main motivation for the new measure is an essential closure property, which is exploited in the summarization stage of the data mining process. Beyond the closure property, the proposed measure has two other important properties. First, it is more conservative than the well-known chi-square test of classical statistics and therefore inherits that test's statistical robustness. The paper does not simply employ the chi-square test, because that test lacks the desired closure property, which may lead to a precision problem in the summarization process. Second, although the proposed measure is more conservative than the chi-square test, in most cases it yields a value almost equal to a conventional measure of statistical significance based on the normal distribution. Building on the closure property, the paper develops an algorithm that summarizes the results of the statistical analysis of the data cube. Although the closure property lets the proposed measure avoid the precision problem, its conservative nature may lead to a recall problem in the data mining process. Conversely, if the chi-square test, which lacks the closure property, were employed, the summarization process could suffer from a precision problem.
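For orientation, the classical chi-square test that the abstract uses as its baseline can be sketched as below. This is not the measure proposed in the paper, whose definition the abstract does not give. The function name `chi_square_2x2` and the example counts are hypothetical, chosen only to illustrate a test of independence on a 2x2 slice of a data cube, and the sketch assumes all row and column totals are non-zero.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). Assumes every row and column total is
    non-zero, so that all expected counts are positive.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            # Expected count under independence of the two attributes.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # A 2x2 table has one degree of freedom; for one degree of freedom the
    # chi-square survival function equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts for two binary attributes (illustrative only).
counts = [[30, 10], [10, 30]]
stat, p_value = chi_square_2x2(counts)
print(f"chi-square = {stat:.2f}, p-value = {p_value:.2g}")
```

A small p-value indicates that the two attributes are unlikely to be independent. A more conservative measure, such as the one the abstract describes, would by definition flag fewer attribute combinations as significant than this test does.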

Similar resources

A survey on Automatic Text Summarization

Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...

Graph Hybrid Summarization

One solution for processing and analyzing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. To generate a better-quality summary of a given attributed graph, both structural and attribute similarities must be considered. There are two measures named density and entropy to evaluate the quality of structural and at...

Text Summarization Using Cuckoo Search Optimization Algorithm

Today, with the rapid growth of the World Wide Web and the creation of Internet sites and online text resources, text summarization has attracted considerable attention from researchers. Extractive text summarization is an important summarization method that consists of selecting the top representative sentences from the input document. When we are facing large-volume documents, the extr...

The Relationship between Emotional Intelligence and Mental Disorders with Internet Addiction in Internet Users University Students

Background: This study aimed to evaluate the relationship of emotional intelligence and mental disorders with internet addiction in university students. Methods: The study used a descriptive-pilot and correlational design. Two hundred internet users (male and female) from Isfahan University and Isfahan University of Technology were randomly selected. For data collection, Carson's emotio...

EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS

Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and word-net, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...

Publication date: 2002